Combining variable latency pipeline with instruction reuse for execution latency reduction

Authors

  • Toshinori Sato
  • Itsujiro Arita
Abstract

Operand bypass logic is likely to be one of the critical structures for future microprocessors to achieve high clock speed. The logic delay causes the execution time budget to be reduced significantly, so that the execution stage is divided into several stages. The variable latency pipeline (VLP) structure has the advantages of pipelining and pseudo-asynchronous design. According to the source operands delivered to arithmetic units, VLP changes execution latency and thus achieves both high speed and low latency for most operands. In this paper we evaluate VLP with dynamically scheduled superscalar processors, using a cycle-by-cycle simulator. Our experimental results show that VLP successfully reduces the effective execution time, and thus relaxes the constraints on the operand bypass logic. We also evaluate the instruction reuse technique in order to support VLP. © 2003 Wiley Periodicals, Inc. Syst Comp Jpn, 34(12): 11–21, 2003; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.10498
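As a rough illustration of the two ideas named in the abstract, the sketch below models an add unit whose latency depends on its source operands, backed by a small reuse buffer. It is a toy model under stated assumptions, not the authors' design: the abstract does not say which operand property selects the latency, so the "narrow operand" test, the 16-bit threshold, the 1- and 2-cycle latencies, the 0-cycle reuse hit, and all names (`vlp_add_latency`, `ReuseBuffer`, `execute_add`) are hypothetical.

```python
from dataclasses import dataclass, field

# All constants below are illustrative assumptions, not values from the paper.
FAST_LATENCY = 1    # cycles when both operands take the assumed fast path
SLOW_LATENCY = 2    # cycles otherwise
REUSE_LATENCY = 0   # assumed: a reused result skips the execution stage
NARROW_BITS = 16    # assumed threshold for a "narrow" operand


def is_narrow(value: int, bits: int = NARROW_BITS) -> bool:
    """Return True if a non-negative value fits in `bits` bits."""
    return 0 <= value < (1 << bits)


def vlp_add_latency(a: int, b: int) -> int:
    """Choose the execution latency from the source operands."""
    return FAST_LATENCY if is_narrow(a) and is_narrow(b) else SLOW_LATENCY


@dataclass
class ReuseBuffer:
    """Toy reuse buffer keyed by (pc, operand values)."""
    entries: dict = field(default_factory=dict)

    def lookup(self, pc: int, a: int, b: int):
        return self.entries.get((pc, a, b))

    def insert(self, pc: int, a: int, b: int, result: int) -> None:
        self.entries[(pc, a, b)] = result


def execute_add(pc: int, a: int, b: int, reuse: ReuseBuffer):
    """Return (result, latency_in_cycles) for a 32-bit add at address pc."""
    cached = reuse.lookup(pc, a, b)
    if cached is not None:
        # Same instruction, same operands: reuse the earlier result.
        return cached, REUSE_LATENCY
    result = (a + b) & 0xFFFFFFFF
    reuse.insert(pc, a, b, result)
    return result, vlp_add_latency(a, b)
```

In this model, an add of two small values finishes on the fast path, an add with a wide operand takes the slow path the first time, and a repeat of that same instruction with the same operands is served from the reuse buffer, which is one way instruction reuse could hide the slow cases.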

Similar resources

Modulo Scheduling with Cache Reuse Information

Instruction scheduling in general, and software pipelining in particular, face the difficult task of scheduling operations in the presence of uncertain latencies. The largest contributor to these uncertain latencies is the use of cache memories required to provide adequate memory access speed in modern processors. Scheduling for instruction-level parallel architectures with nonblocking caches usua...

The Performance Potential of Data Value Reuse

This paper presents a study of the performance limits of data value reuse. Two types of data value reuse are considered: instruction-level reuse and trace-level reuse. The former reuses instances of single instructions whereas the latter reuses sequences of instructions as an atomic unit. Two different scenarios are considered: an infinite resource machine and a machine with a limited instructi...

Variable Latency Caches for Nanoscale Processors

Variability is one of the important issues in nanoscale processors. Due to the increasing importance of interconnect structures in submicron technologies, the physical location and phenomena such as coupling have an increasing impact on the latency of operations. Therefore, the traditional view of rigid access latencies to components will result in suboptimal architectures. In this paper, we devise a c...

Accurate analysis of memory latencies for WCET estimation

In recent years, many researchers have proposed solutions to estimate the Worst-Case Execution Time of a critical application when it is run on modern hardware. Several schemes commonly implemented to improve performance have been considered so far in the context of static WCET analysis: pipelines, instruction caches, dynamic branch predictors, execution cores supporting out-of-order execution...

Multiple-pass Pipelining: Enhancing In-order Microarchitectures to Out-of-order Performance

Out-of-program-order execution has become almost a ubiquitous characteristic of modern processors because of its ability to tolerate variable memory-instruction latency. As designs are becoming increasingly power-conscious, the cost and complexity of the components of out-of-order execution are becoming problematic. Compilers have generally proven adept at planning useful static instruction-lev...

Journal:
  • Systems and Computers in Japan

Volume 34, Issue

Pages -

Publication date: 2003